The GlottHMM Entry for Blizzard Challenge 2011: Utilizing Source Unit Selection in HMM-Based Speech Synthesis for Improved Excitation Generation

Authors

  • Antti Suni
  • Tuomo Raitio
  • Martti Vainio
  • Paavo Alku
Abstract

This paper describes the GlottHMM speech synthesis system for Blizzard Challenge 2011. GlottHMM is a hidden Markov model (HMM) based speech synthesis system that uses glottal inverse filtering to separate the vocal tract and the glottal source from the speech signal and models the two components individually. In this year’s entry, stabilized weighted linear prediction (SWLP) is used to yield more robust estimates of the vocal tract filter of the high-pitched female voice. After inverse filtering, the resulting source signal is parameterized into excitation features and a glottal flow pulse library consisting of a variety of glottal flow pulses. In the synthesis stage, a unit selection scheme is used to reconstruct the source signal: by minimizing the target and concatenation costs, the best-matching glottal flow pulses are selected from the pulse library in order to create a natural voice source. Finally, speech is synthesized by filtering the excitation signal with the vocal tract filter.
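
The pulse selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not define the costs, so everything specific here is an assumption: each pulse and each target is represented by a small feature vector, both costs are squared Euclidean distances, and the function name `select_pulses` and the two weights are hypothetical. The sketch uses a Viterbi-style dynamic program, which is a common way to minimize a sum of target and concatenation costs, not necessarily the search used in the GlottHMM system.

```python
from typing import List, Sequence

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance (assumed cost function, not from the paper)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_pulses(
    targets: List[Vector],
    library: List[Vector],
    target_weight: float = 1.0,   # hypothetical weight on the target cost
    concat_weight: float = 1.0,   # hypothetical weight on the concatenation cost
) -> List[int]:
    """Return one library index per target, minimizing the summed
    target cost (pulse vs. target) and concatenation cost (pulse vs.
    previously selected pulse) with a Viterbi-style search."""
    if not targets or not library:
        return []
    n = len(library)
    # cost[j]: lowest cumulative cost of any path ending in library pulse j
    cost = [target_weight * sq_dist(targets[0], p) for p in library]
    back: List[List[int]] = []  # back[t][j]: best predecessor of pulse j
    for target in targets[1:]:
        new_cost, pointers = [], []
        for pulse in library:
            # Cheapest predecessor for this pulse, including the join cost
            best_i = min(
                range(n),
                key=lambda i: cost[i] + concat_weight * sq_dist(library[i], pulse),
            )
            join = concat_weight * sq_dist(library[best_i], pulse)
            new_cost.append(
                cost[best_i] + join + target_weight * sq_dist(target, pulse)
            )
            pointers.append(best_i)
        cost = new_cost
        back.append(pointers)
    # Backtrack from the cheapest final pulse
    j = min(range(n), key=cost.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path
```

With a one-dimensional toy library `[[0.0], [1.0], [5.0]]` and targets `[[0.1], [0.9], [4.8]]`, setting `concat_weight=0.0` picks the nearest pulse for each target, while a large `concat_weight` favors a smoother sequence at the expense of target match. The search is quadratic in the library size per target, so a real system would likely prune candidates first.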

Similar resources

The GlottHMM Speech Synthesis Entry for Blizzard Challenge 2010

This paper describes the GlottHMM speech synthesis entry for Blizzard Challenge 2010. GlottHMM is a hidden Markov model (HMM) based speech synthesis system that utilizes glottal inverse filtering for separating the vocal tract from the glottal source. The source and the filter characteristics are modeled separately in the framework of HMM. In the synthesis stage, natural glottal flow pulses are...

The GlottHMM Entry for Blizzard Challenge 2012: Hybrid Approach

This paper describes the GlottHMM speech synthesis system for Blizzard Challenge 2012. The aim of the GlottHMM system is to combine high-quality vocoding and detailed prosody modeling in order to produce expressive, high quality, synthetic speech. GlottHMM is based on statistical parametric speech synthesis, but it uses a glottal flow pulse library for generating the excitation signal. Thus, it...

NICT Blizzard Challenge 2010 Entry

This paper details a speech synthesis system developed at NICT for the Blizzard Challenge 2010. The system depends on an HMM-based speech synthesis technique that possesses two distinctive features: HMM training under global-variance constraint on the parameter trajectory and trainable mixed excitation for source-filter vocoding. For this year’s entry, we added some modifications to the system ...

The NTNU Concatenative Speech Synthesizer

This paper describes NTNU’s entry for the Blizzard Challenge 2010. Our system is a conceptually simple variation of an HMM-based unit selection system, which uses diphones as the basic unit and employs a combined selection of units and their join points. The evaluation results of the Blizzard Challenge 2010 show that the system performs well when compared with the other systems.

The USTC System for Blizzard Challenge 2014

This paper introduces the speech synthesis system developed by USTC for Blizzard Challenge 2014. Six Indian languages were evaluated this year, including Assamese, Gujarati, Hindi, Rajasthani, Tamil and Telugu. Two tasks were built for these languages: the mono-lingual task (IH1 hub task) and the multi-lingual task (IH2 spoken task). We submitted entries to both tasks in all languages. We submi...

Publication date: 2011